Search results for "ZIPF'S LAW"

showing 6 items of 6 documents

Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts

1997

We perform a numerical study of the statistical properties of natural texts written in English and of two types of artificial texts. As statistical tools we use the conventional Zipf analysis of the distribution of words and the inverse Zipf analysis of the distribution of frequencies of words, the analysis of vocabulary growth, the Shannon entropy and a quantity which is a nonlinear function of frequencies of words, the frequency "entropy". Our numerical results, obtained by investigation of eight complete books and sixteen related artificial texts, suggest that, among these analyses, the analysis of vocabulary growth shows the most striking difference between natural and artificial texts…

Vocabulary; Zipf's law; business.industry; Applied Mathematics; media_common.quotation_subject; Numerical analysis; Inverse; computer.software_genre; Word lists by frequency; Modeling and Simulation; Entropy (information theory); Geometry and Topology; Artificial intelligence; business; computer; Natural language processing; Natural language; Mathematics; media_common; Fractals
researchProduct
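The analyses named in the abstract above (Zipf analysis, inverse Zipf analysis, vocabulary growth, Shannon entropy) can be sketched on a toy token list. This is a minimal illustration, not the authors' procedure: the function names and the sample sentence are assumptions made here, and the paper's frequency "entropy" is left out because the abstract does not define it.

```python
import math
from collections import Counter


def zipf_table(tokens):
    """Rank words by descending count (ties broken alphabetically)."""
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def inverse_zipf(tokens):
    """Map each frequency f to the number of distinct words occurring f times."""
    counts = Counter(tokens)
    return dict(Counter(counts.values()))


def vocabulary_growth(tokens):
    """Number of distinct words seen after reading each successive token."""
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the empirical word distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample text, chosen only to keep the example small.
tokens = "the cat sat on the mat and the dog sat".split()
table = zipf_table(tokens)
growth = vocabulary_growth(tokens)
```

On a real book, plotting `growth` against token position gives the vocabulary growth curve that the abstract reports as the most discriminating of these analyses.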

Zipf’s Law and World Income Distribution

2008

The aim of this article is to demonstrate regularity in the world income distribution. In particular, using GDP per capita data for the period 1980 to 2004, the article shows that the world income distribution follows the well-known 'rank-size rule'.

Zipf's law
researchProduct
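The 'rank-size rule' mentioned in the abstract above is commonly checked by regressing log(size) on log(rank); a slope near -1 corresponds to Zipf's law. The sketch below assumes that plain least-squares approach and uses synthetic data, not the article's GDP per capita series or its estimation method. The name `fit_rank_size` is hypothetical.

```python
import math


def fit_rank_size(sizes):
    """Fit log(size) = intercept + slope * log(rank) by ordinary least squares.

    Sizes must be positive; the largest value receives rank 1.
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic sizes that follow size = 1000 / rank exactly (an assumption
# made for illustration, not data from the article).
synthetic_sizes = [1000.0 / rank for rank in range(1, 51)]
slope, intercept = fit_rank_size(synthetic_sizes)
```

Least squares on a log-log plot is the simplest option and can give biased estimates on real data, so the fitted slope is best read as a first check.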

Comparison of MeSH terms and KeyWords Plus terms for more accurate classification in medical research fields. A case study in cannabis research

2021

KeyWords Plus and Medical Subject Headings (MeSH) are widely used in bibliometric studies for topic mapping. The objective of this study is to compare the two description systems in documents about cannabis research to find the concordance between systems and establish whether there is neutrality in topic mapping. A total of 25,593 articles from 1970 to 2019 were drawn from Web of Science's Core Collection and Medline and analyzed. The tidytext library, Zipf's law, topic modeling tools, the contingency coefficient, Cramer's V, and Cohen's kappa were used. The results included 10,107 MeSH terms and 28,870 KeyWords Plus terms. The Zipf distribution of the terms was different for each…

Topic model; Contingency table; Information retrieval; Zipf's law; Computer science; Concordance; MEDLINE; Subject (documents); Library and Information Sciences; Management Science and Operations Research; Computer Science Applications; Cohen's kappa; Media Technology; Kappa; Information Systems; Information Processing & Management
researchProduct

Pareto or log-normal? A recursive-truncation approach to the distribution of (all) cities

2012

Traditionally, it is assumed that the population size of cities in a country follows a Pareto distribution. This assumption is typically supported by finding evidence of Zipf's Law. Recent studies question this finding, highlighting that, while the Pareto distribution may fit reasonably well when the data is truncated at the upper tail, i.e. for the largest cities of a country, the log-normal distribution may apply when all cities are considered. Moreover, conclusions may be sensitive to the choice of a particular truncation threshold, an issue so far overlooked in the literature. In this paper, then, we reassess the city size distribution in relation to its sensitivity to the choice of truncat…

jel:D30; jel:C46; jel:R12; City size distribution; Pareto and Log-normal; Zipf's Law; Kolmogorov-Smirnov; Recursive analysis
researchProduct

The 'power' of tourism in Portugal

2012

The author analyses the upper tail of the distribution of tourism supply in Portugal from 2002 to 2009, using data from the Instituto Nacional de Estatística database. Tourism supply is defined in terms of the lodging capacity of hotel establishments in about 250 tourist destinations. The paper shows that the empirical distribution of tourism supply in Portugal is heavy-tailed and consistent with a power law behaviour in its upper tail. Such behaviour seems to be stable over the years, provided that, for the time horizon covered by the data sets, the scaling parameter is always close to the value of two. The power law hypothesis is tested positively through the use of graphical and analyti…

Zipf's law; business.industry; power law behavior; Geography Planning and Development; Distribution (economics); Time horizon; Destinations; HEAVY-TAILED DISTRIBUTION; POWER LAW BEHAVIOUR; Empirical distribution function; ZIPF'S LAW; TOURISM SUPPLY; Economy; Settore SECS-S/06 -Metodi Mat. dell'Economia e d. Scienze Attuariali e Finanz.; Heavy-tailed distribution; Tourism Leisure and Hospitality Management; Value (economics); Economics; Regional science; scaling parameter; business; Tourism; tourism supply distribution
researchProduct
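The "scaling parameter" of an upper-tail power law, as discussed in the abstract above, is often estimated by maximum likelihood once a lower cut-off has been chosen. The sketch below assumes the continuous density form p(x) proportional to x^(-alpha) for x >= x_min; the abstract does not say which parametrisation or estimator the author uses, so both are assumptions here. The names `tail_exponent_mle` and `pareto_quantiles`, and the synthetic sample, are hypothetical.

```python
import math


def tail_exponent_mle(values, x_min):
    """Maximum-likelihood estimate of alpha for p(x) ~ x**(-alpha), x >= x_min.

    Uses alpha_hat = 1 + n / sum(log(x_i / x_min)) over the n tail values.
    Returns (alpha_hat, n).
    """
    tail = [v for v in values if v >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(v / x_min) for v in tail)
    if log_sum == 0.0:
        raise ValueError("all tail observations equal x_min")
    return 1.0 + len(tail) / log_sum, len(tail)


def pareto_quantiles(alpha, x_min, n):
    """Deterministic sample: evenly spaced quantiles of the assumed power law."""
    return [x_min * (1.0 - (i + 0.5) / n) ** (-1.0 / (alpha - 1.0))
            for i in range(n)]


# Synthetic lodging-capacity-like values with alpha = 2 (illustration only,
# not the Instituto Nacional de Estatística data).
sample = pareto_quantiles(alpha=2.0, x_min=10.0, n=1000)
alpha_hat, n_tail = tail_exponent_mle(sample, x_min=10.0)
```

The estimate depends on the cut-off `x_min`, so in practice the fit would be repeated over several thresholds before the exponent is interpreted.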

Analysis of the relationship between position in the "Diamenty Forbesa" ranking and growth in enterprise value

2016

The position of a company in a ranking depends on many factors, and it is often difficult to satisfy all requirements simultaneously so as to reach one of the top positions. Analysis of enterprise size in various countries confirms a trend: small enterprises are numerous, while large ones are in the minority. Considering rankings of firms by asset size, Zipf established a relationship between a firm's position in the ranking and the size of its assets. The aim of the article was to determine the relationship between the growth of a company and its position in the ranking. Studies have shown that it is possible to describe the dependence of growth ent…

ranking; Zipf's law; enterprise value
researchProduct